Regularized feature-based maximum likelihood linear regression for speech recognition

Author

  • Mohamed Kamal Omar
Abstract

In many automatic speech recognition (ASR) applications, maximum likelihood linear regression (MLLR) and feature-based maximum likelihood linear regression (FMLLR) are used for speaker adaptation. This paper investigates a possible generalization of FMLLR that addresses the degradation in ASR performance caused by small, possibly time-varying, perturbations of the training and testing data. We formulate the problem as a regularized maximum likelihood linear regression problem. Based on this formulation, we describe a computationally efficient algorithm for estimating the linear regression parameters that maximize the sum of the log likelihood and the negative of a measure of the sensitivity of the estimated likelihood to these perturbations. This approach makes no assumptions about the noise model during training or testing. We present several large-vocabulary speech recognition experiments that show a significant improvement in recognition accuracy over the speaker-adapted baseline models.
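
The regularized objective described in the abstract can be sketched as follows. This is only an illustrative form, not the paper's own formulation: the first line is the standard FMLLR auxiliary function for an affine feature transform, while the sensitivity measure S(A, b) and the weight λ are placeholders assumed here, since the abstract does not define them. The symbols x_t (feature vector at frame t), γ_m(t) (posterior of Gaussian m at frame t), and μ_m, Σ_m (Gaussian mean and covariance) are likewise notation introduced for this sketch.

```latex
% Standard FMLLR auxiliary function for the transform \hat{x}_t = A x_t + b
\mathcal{Q}(A, b) = \sum_{t=1}^{T} \sum_{m} \gamma_m(t)
  \Bigl[ \log \lvert \det A \rvert
       + \log \mathcal{N}\bigl(A x_t + b;\ \mu_m, \Sigma_m\bigr) \Bigr]

% Regularized objective (assumed form): log likelihood minus a
% sensitivity penalty; S and \lambda are hypothetical placeholders
(\hat{A}, \hat{b}) = \arg\max_{A,\, b}\
  \Bigl[ \mathcal{Q}(A, b) - \lambda\, S(A, b) \Bigr]
```

Setting λ = 1 corresponds to the literal wording of the abstract (the sum of the log likelihood and the negative of the sensitivity measure), and λ = 0 recovers ordinary FMLLR.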

Similar resources

Improvements to generalized discriminative feature transformation for speech recognition

Generalized Discriminative Feature Transformation (GDFT) is a feature space discriminative training algorithm for automatic speech recognition (ASR). GDFT uses Lagrange relaxation to transform the constrained maximum likelihood linear regression (CMLLR) algorithm for feature space discriminative training. This paper presents recent improvements on GDFT, which are achieved by regularization to t...

Tied and Regularized Conditional Gaussian Graphical Models for Acoustic Modeling in ASR

Most automatic speech recognition (ASR) systems express probability densities over sequences of acoustic feature vectors using Gaussian or Gaussian-mixture hidden Markov models. In this chapter, we explore how graphical models can help describe a variety of tied (i.e., parameter shared) and regularized Gaussian mixture systems. Unlike many previous such tied systems, however, here we allow sub-p...

Discriminative adaptation for log-linear acoustic models

Log-linear models have recently been used in acoustic modeling for speech recognition systems. This has been motivated by competitive results compared to systems based on Gaussian models, and a more direct parametrisation of the posterior model. To competitively use log-linear models for speech recognition, important methods, such as speaker adaptation, have to be reformulated in a log-linear f...

Generalized discriminative feature transformation for speech recognition

We propose a new algorithm called Generalized Discriminative Feature Transformation (GDFT) for acoustic models in speech recognition. GDFT is based on Lagrange relaxation on a transformed optimization problem. We show that the existing discriminative feature transformation methods like feature space MMI/MPE (fMMI/MPE), region dependent linear transformation (RDLT), and a non-discriminative feat...

Speech Activity Detection for Noisy Data Using Adaptation Techniques

Automatic detection of speech in audio streams has become an important preprocessing step for speech recognition, speaker recognition, and audio data mining. In many applications, the speech activity detection has to be performed on highly degraded audio streams. We present here our work to address the challenge of speech activity detection for highly degraded channel conditions. We present two...

Publication date: 2007